Convolutional fusion network for monaural speech enhancement

Abstract

Convolutional neural network (CNN) based methods, such as the convolutional encoder–decoder network, offer state-of-the-art results in monaural speech enhancement. In the conventional encoder–decoder network, a large kernel size is often used to enhance the model capacity, which, however, results in low parameter efficiency. This could be addressed by using group convolution, as in AlexNet, where group convolutions are performed in parallel in each layer, before their outputs are concatenated. However, with simple concatenation, the inter-channel dependency information may be lost. To address this, the Shuffle network re-arranges the outputs of each group before concatenating them, taking part of the whole input sequence as the input to each group of convolution. In this work, we propose a new convolutional fusion network (CFN) for monaural speech enhancement by improving model performance, inter-channel dependency, information reuse and parameter efficiency. First, a new group convolutional fusion unit (GCFU) consisting of the standard and depth-wise separable CNN is used to reconstruct the signal. Second, the whole input sequence (full information) is fed simultaneously to two convolution networks in parallel, and their outputs are re-arranged (shuffled) and then concatenated, in order to exploit the inter-channel dependency within the network. Third, the intra skip connection mechanism is used to connect different layers inside the encoder as well as the decoder to further improve the model performance. Extensive experiments show the improved performance of the proposed method as compared with three recent baseline methods.
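The parameter-efficiency and shuffling ideas mentioned in the abstract can be illustrated with a small sketch. It is not the authors' CFN implementation: the kernel size, the channel counts, and the helper names `conv_params`, `separable_params` and `shuffle_concat` are assumptions made for illustration only. The sketch counts weights (biases ignored) for a standard, a grouped, and a depth-wise separable convolution, and then contrasts plain concatenation of two branch outputs with a generic channel interleaving.

```python
def conv_params(kernel, c_in, c_out, groups=1):
    """Weight count (no bias) of a 2-D convolution with a square kernel."""
    assert c_in % groups == 0 and c_out % groups == 0
    # Each output channel only sees c_in / groups input channels.
    return kernel * kernel * (c_in // groups) * c_out


def separable_params(kernel, c_in, c_out):
    """Depth-wise stage (one filter per input channel) plus 1x1 point-wise stage."""
    depthwise = conv_params(kernel, c_in, c_in, groups=c_in)
    pointwise = conv_params(1, c_in, c_out)
    return depthwise + pointwise


def shuffle_concat(branch_outputs):
    """Interleave the channels of equally sized branches instead of stacking them."""
    n_channels = len(branch_outputs[0])
    assert all(len(b) == n_channels for b in branch_outputs)
    return [b[i] for i in range(n_channels) for b in branch_outputs]


# Hypothetical layer sizes: 5x5 kernel, 64 input and 64 output channels.
print(conv_params(5, 64, 64))            # standard convolution: 102400
print(conv_params(5, 64, 64, groups=2))  # two parallel groups:   51200
print(separable_params(5, 64, 64))       # depth-wise separable:  5696

# Channel labels stand in for feature maps produced by two parallel branches.
branch_a = ["a0", "a1", "a2"]
branch_b = ["b0", "b1", "b2"]
print(branch_a + branch_b)                  # plain concatenation
print(shuffle_concat([branch_a, branch_b])) # ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
```

With plain concatenation, a following grouped layer would see channels from only one branch per group; after interleaving, each group receives channels from both branches, which is the kind of inter-channel mixing the abstract refers to.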

Similar resources

SNR-Aware Convolutional Neural Network Modeling for Speech Enhancement

This paper proposes a signal-to-noise-ratio (SNR) aware convolutional neural network (CNN) model for speech enhancement (SE). Because the CNN model can deal with local temporal-spectral structures of speech signals, it can effectively disentangle the speech and noise signals given the noisy speech signals. In order to enhance the generalization capability and accuracy, we propose two SNR-aware ...

A Fully Convolutional Neural Network for Speech Enhancement

In hearing aids, the presence of babble noise degrades hearing intelligibility of human speech greatly. However, removing the babble without creating artifacts in human speech is a challenging task in a low SNR environment. Here, we sought to solve the problem by finding a ‘mapping’ between noisy speech spectra and clean speech spectra via supervised learning. Specifically, we propose using ful...

Phoneme-Dependent NMF for Speech Enhancement in Monaural Mixtures

The problem of separating speech signals out of monaural mixtures (with other non-speech or speech signals) has become increasingly popular in recent times. Among the various solutions proposed, the most popular methods are based on compositional models such as non-negative matrix factorization (NMF) and latent variable models. Although these techniques are highly effective they largely ignore ...

Convolutional Gating Network for Object Tracking

Object tracking through multiple cameras is a popular research topic in security and surveillance systems especially when human objects are the target. However, occlusion is one of the challenging problems for the tracking process. This paper proposes a multiple-camera-based cooperative tracking method to overcome the occlusion problem. The paper presents a new model for combining convolutiona...

Monaural Speech Separation

Monaural speech separation has been studied in previous systems that incorporate auditory scene analysis principles. A major problem for these systems is their inability to deal with speech in the high-frequency range. Psychoacoustic evidence suggests that different perceptual mechanisms are involved in handling resolved and unresolved harmonics. Motivated by this, we propose a model for monaura...

Journal

Journal title: Neural Networks

Year: 2021

ISSN: 1879-2782, 0893-6080

DOI: https://doi.org/10.1016/j.neunet.2021.05.017